MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets

Abstract

Small molecule lipophilicity is often included in generalized rules for medicinal chemistry. These rules aim to reduce time, effort, costs, and attrition rates in drug discovery, allowing the rejection or prioritization of compounds without the need for synthesis and testing. The availability of high quality, abundant training data for machine learning methods can be a major limiting factor in building effective property predictors. We utilize transfer learning techniques to get around this problem, first learning on a large amount of low accuracy predicted logP values before finally tuning our model using a small, accurate dataset of 244 druglike compounds to create MRlogP, a neural network-based predictor of logP capable of outperforming state-of-the-art freely available logP prediction methods for druglike small molecules. MRlogP achieves an average root mean squared error of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP, respectively. We have made the trained neural network predictor and all associated code for descriptor generation freely available. In addition, MRlogP may be used online via a web interface.
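The two-stage idea in the abstract (learn first on many low-accuracy labels, then tune on a few accurate ones, and score with root mean squared error) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the model is a plain linear regressor rather than the neural network the abstract describes, and the names `rmse` and `train` are hypothetical helpers, not part of the MRlogP code.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def train(X, y, w, lr, epochs):
    """Full-batch gradient descent on mean squared error, starting from weights w."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
n_features = 8
true_w = rng.normal(size=n_features)  # hypothetical "true" descriptor-to-logP relation

# Stage 1 data: large set with low-accuracy labels (systematically biased and noisy).
X_large = rng.normal(size=(5000, n_features))
y_large = X_large @ (true_w + 0.3) + rng.normal(scale=0.5, size=5000)

# Stage 2 data: small set with accurate labels (244 rows, echoing the abstract).
X_small = rng.normal(size=(244, n_features))
y_small = X_small @ true_w + rng.normal(scale=0.05, size=244)

# Held-out accurate test set.
X_test = rng.normal(size=(1000, n_features))
y_test = X_test @ true_w

# Pre-train from scratch on the large set, then fine-tune from those weights
# with a smaller learning rate and fewer epochs on the small set.
w_pre = train(X_large, y_large, np.zeros(n_features), lr=0.05, epochs=200)
w_fine = train(X_small, y_small, w_pre, lr=0.01, epochs=50)

rmse_pre = rmse(y_test, X_test @ w_pre)
rmse_fine = rmse(y_test, X_test @ w_fine)
print(f"RMSE after pre-training: {rmse_pre:.3f}")
print(f"RMSE after fine-tuning:  {rmse_fine:.3f}")
```

In this toy setting the fine-tuned weights start from the pre-trained ones instead of from zero, so the small accurate set only has to correct the bias inherited from the low-accuracy labels. The numbers it prints say nothing about MRlogP's reported performance.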

Similar resources

Learning using Large Datasets

This contribution develops a theoretical framework that takes into account the effect of approximate optimization on learning algorithms. The analysis shows distinct tradeoffs for the case of small-scale and large-scale learning problems. Small-scale learning problems are subject to the usual approximation–estimation tradeoff. Large-scale learning problems are subject to a qualitatively differ...

Learning Singly-Recursive Relations from Small Datasets

The inductive logic programming system LOPSTER was created to demonstrate the advantage of basing induction on logical implication rather than θ-subsumption. LOPSTER's sub-unification procedures allow it to induce recursive relations using a minimum number of examples, whereas inductive logic programming algorithms based on θ-subsumption require many more examples to solve induction tasks. However...

A Transfer Learning Approach for Applying Matrix Factorization to Small ITS Datasets

Machine Learning methods for Performance Prediction in Intelligent Tutoring Systems (ITS) have proven their efficacy; specific methods, e.g. Matrix Factorization (MF), however suffer from the lack of available information about new tasks or new students. In this paper we show how this problem could be solved by applying Transfer Learning (TL), i.e. combining similar but not equal datasets to tr...

Pre-Processing-Free Gear Fault Diagnosis Using Small Datasets with Deep Convolutional Neural Network-Based Transfer Learning

Early fault diagnosis in complex mechanical systems such as gearboxes has always been a great challenge, even with the recent development in deep neural networks. The performance of a classic fault diagnosis system predominantly depends on the features extracted and the classifier subsequently applied. Although a large number of attempts have been made regarding feature extraction techniques, the...

Transfer Learning Using Experimental State Splitting and Image Schemas

Jean is a model of early cognitive development based loosely on Piaget’s theory of sensori-motor and preoperational thought (Piaget 1954). Like an infant, Jean repeatedly executes schemas, gradually extending its schemas to accommodate new experiences. Jean’s environment is a simulated “playpen” in which Jean and other objects move about and interact. Jean’s cognitive development depends on sev...

Journal

Journal title: Processes

Year: 2021

ISSN: 2227-9717

DOI: https://doi.org/10.3390/pr9112029